ABSTRACT

A well-known theorem usually attributed to Keilson states that, for an irreducible
continuous-time birth-and-death chain on the nonnegative integers and any d, the passage
time from state 0 to state d is distributed as a sum of d independent exponential
random variables. Until now, no probabilistic proof of the theorem has been known.
In this paper we use the theory of strong stationary duality to give a stochastic proof 
of a similar result for discrete-time birth-and-death chains and geometric random vari- 
ables, and the continuous-time result (which can also be given a direct stochastic proof) 
then follows immediately. In both cases we link the parameters of the distributions to 
eigenvalue information about the chain. We also discuss how the continuous-time result 
leads to a proof of the Ray-Knight theorem. 

Intimately related to the passage-time theorem is a theorem of Fill that any fastest 
strong stationary time T for an ergodic birth-and-death chain on {0, . . . , d} in continuous 
time with generator G, started in state 0, is distributed as a sum of d independent 
exponential random variables whose rate parameters are the nonzero eigenvalues of 
$-G$. Our approach yields the first (sample-path) construction of such a T for which
individual such exponentials summing to T can be explicitly identified. 
1 Introduction and summary

A well-known theorem usually attributed to Keilson [12] (Theorem 5.1A, together with
Remark 5.1B; see also Section 1 of [H]), but which, as pointed out by Laurent
Saloff-Coste via Diaconis and Miclo [5], can be traced back at least as far as Karlin and
McGregor [10, equation (45)], states that, for an irreducible continuous-time
birth-and-death chain on the nonnegative integers and any d, the passage time from state 0
to state d is distributed as a sum of d independent exponential random variables with
distinct rate parameters. Keilson, like Karlin and McGregor, proves this result by
analytical (non-probabilistic) means.

Modulo the distinctness of the rates, and with additional information (see, e.g., [2]) 
relating the exponential rates to spectral information about the chain, the theorem can 
be recast as follows. 

Theorem 1.1. Consider a continuous-time birth-and-death chain with generator $G^*$ on
the state space $\{0, \dots, d\}$ started at 0, suppose that d is an absorbing state, and suppose
that the other birth rates $\lambda_i^*$, $0 \le i \le d-1$, and death rates $\mu_i^*$, $1 \le i \le d-1$, are
positive. Then the absorption time in state d is distributed as the sum of d independent
exponential random variables whose rate parameters are the d nonzero eigenvalues of
$-G^*$.

There is an analogue for discrete time: 

Theorem 1.2. Consider a discrete-time birth-and-death chain with transition kernel $P^*$
on the state space $\{0, \dots, d\}$ started at 0, suppose that d is an absorbing state, and
suppose that the other birth probabilities $p_i^*$, $0 \le i \le d-1$, and death probabilities $q_i^*$,
$1 \le i \le d-1$, are positive. Then the absorption time in state d has probability generating
function

$$ u \mapsto \prod_{j=0}^{d-1} \left[ \frac{(1 - \theta_j) u}{1 - \theta_j u} \right], $$

where $-1 \le \theta_j < 1$ are the d non-unit eigenvalues of $P^*$.

In this paper we will give a stochastic proof of Theorem 1.2 under the additional
hypothesis that all eigenvalues of $P^*$ are (strictly) positive; as we shall see later (Lemma 2.4),
this implies another condition key to our development, namely, that

$$ p^*_{i-1} + q^*_i < 1, \qquad 1 \le i \le d. \tag{1.1} $$

Whenever $P^*$ has nonnegative eigenvalues, the conclusion of Theorem 1.2 simplifies:

The absorption time in state d is distributed as the sum of d independent geometric 
random variables whose failure probabilities are the non-unit eigenvalues of P* . 

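The special case of Theorem 1.2 for positive eigenvalues establishes the theorem in
general by the following argument (which is unusual, in that it is not often easy to relate
characteristics of a chain to a "lazy" modification). Choose any $\varepsilon \in (0, 1/2)$ and apply
the special case of Theorem 1.2 to the "lazy" kernel $P^*(\varepsilon) := (1 - \varepsilon) I + \varepsilon P^*$, whose
non-unit eigenvalues $1 - \varepsilon(1 - \theta_j)$ are all positive. Let $T^*$
and $T^*(\varepsilon)$ denote the respective absorption times for $P^*$ and $P^*(\varepsilon)$. Then $T^*(\varepsilon)$ has
probability generating function (pgf)

$$ E\, s^{T^*(\varepsilon)} = \prod_{j=0}^{d-1} \left[ \frac{\varepsilon (1 - \theta_j) s}{1 - (1 - \varepsilon (1 - \theta_j)) s} \right]. \tag{1.2} $$

But the conditional distribution of $T^*(\varepsilon)$ given $T^*$ is negative binomial with
parameters $T^*$ and $\varepsilon$, so the pgf of $T^*(\varepsilon)$ can also be computed as

$$ E\, s^{T^*(\varepsilon)} = E\, E\left( s^{T^*(\varepsilon)} \mid T^* \right) = E \left[ \frac{\varepsilon s}{1 - (1 - \varepsilon) s} \right]^{T^*}. \tag{1.3} $$

Equating (1.2) and (1.3) and letting $u := \varepsilon s / [1 - (1 - \varepsilon) s]$, we find, as desired,

$$ E\, u^{T^*} = \prod_{j=0}^{d-1} \left[ \frac{(1 - \theta_j) u}{1 - \theta_j u} \right]. $$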
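The substitution step in the lazy-chain argument can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original argument: the function names and the sample values of $\varepsilon$, s, and the $\theta_j$ are hypothetical, and the check only confirms that each factor of (1.2) evaluated at s agrees with the corresponding factor of the pgf in Theorem 1.2 evaluated at $u = \varepsilon s / [1 - (1 - \varepsilon) s]$.

```python
from fractions import Fraction

def geometric_pgf(failure_prob, s):
    """pgf at s of a geometric variable on {1, 2, ...} with the given failure probability."""
    return (1 - failure_prob) * s / (1 - failure_prob * s)

def factors_agree(eps, s, theta):
    """Compare one factor of (1.2) at s with one factor of the target pgf at u."""
    u = eps * s / (1 - (1 - eps) * s)        # the substitution used in the text
    lazy_theta = 1 - eps * (1 - theta)       # eigenvalue of (1 - eps) I + eps P*
    return geometric_pgf(lazy_theta, s) == geometric_pgf(theta, u)

# Hypothetical values: eps in (0, 1/2), s in (0, 1), theta_j in [-1, 1).
eps, s = Fraction(1, 3), Fraction(2, 5)
thetas = [Fraction(-1, 2), Fraction(0), Fraction(3, 4)]
all_agree = all(factors_agree(eps, s, theta) for theta in thetas)
```

Because `Fraction` arithmetic is exact, equality here is an algebraic identity check at the chosen points rather than a floating-point approximation.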



Later in this section we explain how Theorem 1.1 follows from Theorem 1.2, but in
Section 5 we will also outline a direct stochastic proof of Theorem 1.1.

Remark 1.3. (a) Theorems 1.1 and 1.2 are the starting point of an in-depth
consideration of separation cut-off for birth-and-death chains in [6].

(b) By a simple perturbation argument, Theorems 1.1 and 1.2 extend to all
birth-and-death chains for which the birth rates $\lambda_i^*$ (respectively, birth probabilities $p_i^*$),
$0 \le i \le d-1$, are positive.

(c) There is a stochastic interpretation of the pgf in Theorem 1.2 even when some
of the eigenvalues are negative (see (4.23) in [3]), but we do not know a stochastic proof
(i.e., a proof that proceeds by constructing random variables) in that case.

(d) The condition (1.1) is closely related to the notion of (stochastic) monotonicity.
All continuous-time, but not all discrete-time, birth-and-death chains are monotone.
In discrete time, monotonicity for a general chain is the requirement that the
distributions $P^*(i, \cdot)$ in the successive rows of $P^*$ be stochastically nondecreasing, i.e., that
$\sum_{k > j} P^*(i, k)$ be nondecreasing in i for each j. As noted in [3], for a discrete-time
birth-and-death chain $P^*$, monotonicity is equivalent to the condition

$$ p^*_{i-1} + q^*_i \le 1, \qquad 1 \le i \le d. $$

We need only prove the discrete-time Theorem 1.2 (or even just the special case
where $P^*$ has positive eigenvalues), for then given a continuous-time birth-and-death
generator $G^*$ we can consider the discrete-time birth-and-death kernels $P^*(\varepsilon) := I + \varepsilon G^*$,
where I denotes the identity matrix and $\varepsilon > 0$ is chosen sufficiently small that $P^*(\varepsilon)$ is
nonnegative and has positive eigenvalues. Let $T(\varepsilon)$ and T denote the absorption times
for $P^*(\varepsilon)$ and $G^*$, respectively. Then it is simple to check that $\varepsilon T(\varepsilon)$ converges in law
to T; indeed, for any $0 < t < \infty$ we have

$$ P(\varepsilon T(\varepsilon) \le t) = (P^*(\varepsilon))^{\lfloor t/\varepsilon \rfloor}(0, d) \to (e^{tG^*})(0, d) = P(T \le t). $$

But the eigenvalues of $P^*(\varepsilon)$ and of $-G^*$ are simply related, and suitably scaled
geometric random variables converge in law to exponentials, so Theorem 1.1 follows
immediately.
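The convergence of the discretized kernels can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the generator `G` is a hypothetical example on {0, 1, 2}, the matrix exponential is approximated by a plain Taylor series (adequate only for small matrices such as this one), and all helper names are invented for the example.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(a, n):
    """n-th power by repeated squaring."""
    size = len(a)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    base = a
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result

def mat_exp(a, terms=80):
    """Taylor series for exp(a); a rough stand-in for a proper matrix exponential."""
    size = len(a)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(size)] for i in range(size)]
    return result

# Hypothetical generator G* on {0, 1, 2}; state 2 is absorbing.
G = [[-1.0, 1.0, 0.0],
     [0.5, -2.0, 1.5],
     [0.0, 0.0, 0.0]]
t = 1.0

def discrete_cdf(eps):
    """P(eps * T(eps) <= t) for the kernel I + eps * G."""
    kernel = [[float(i == j) + eps * G[i][j] for j in range(3)] for i in range(3)]
    return mat_pow(kernel, math.floor(t / eps))[0][2]

continuous_cdf = mat_exp([[t * g for g in row] for row in G])[0][2]
```

Evaluating `discrete_cdf(eps)` for decreasing `eps` and comparing with `continuous_cdf` shows the two probabilities approaching each other.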

The idea of our proof of Theorem 1.2 is simple: We show that the absorption time
(call it $T^*$) of $P^*$ has the same distribution as $\hat{T}$, where $\hat{T}$ is the absorption time
of a certain pure-birth chain $\hat{P}$ whose holding probabilities are precisely the non-unit
eigenvalues of $P^*$.

We do this by reviewing (in Section 2) and then employing the Diaconis and Fill [3]
theory of strong stationary duality in discrete time. In brief, a given absorbing
birth-and-death chain $P^*$ satisfying (1.1) is the classical set-valued strong stationary dual
(SSD) of some monotone birth-and-death chain P with the same eigenvalues; naturally
enough, we will call P an "anti-dual" of $P^*$. But, if also the eigenvalues of $P^*$ are
nonnegative, then we show that this P (and indeed any ergodic birth-and-death chain
with nonnegative eigenvalues) in turn also has a pure-birth SSD $\hat{P}$ whose holding
probabilities are precisely the non-unit eigenvalues of P. Since we argue that both duals are
sharp (i.e., give rise to a stochastically minimal strong stationary time for the P-chain),
the absorption time $T^*$ of $P^*$ has the same distribution as the absorption time $\hat{T}$ of $\hat{P}$,
and the latter distribution is manifestly the convolution of geometric distributions.

Remark 1.4. (a) Although our proof of Theorem 1.2 is stochastic, it leaves open [or,
rather, left open — see part (c) of this remark] the question of whether the absorption 
time itself can be represented as an independent sum of explicit geometric random 
variables; the proof establishes only equality in distribution. The difficulty with our 
approach is that there can be many different stochastically minimal strong stationary 
times for a given chain. 

(b) However, for either of the two steps of our argument we can give sample-path
constructions relating the two chains (either $P^*$ and P, or P and $\hat{P}$). This has already
been carried out in detail for the first step in [3]. For the second step, what this means
is that we can show how to watch the P-chain X run and contemporaneously construct
from it a chain $\hat{X}$ with kernel $\hat{P}$ in such a way that the absorption time $\hat{T}$ of $\hat{P}$ is a
fastest strong stationary time for X.

(c) Subsequent to the work leading to the present paper, Diaconis and Miclo [5] gave
another stochastic proof of Theorem 1.1. Their proof, which provides an "intertwining"
between the kernels $P^*$ and $\hat{P}$ (in our notation), yields a construction of exponentials
summing to the absorption time, but the construction is, by their own estimation, "quite
involved". In a forthcoming paper [9], we will exhibit a much simpler such construction,
with extensions to skip-free processes.

Section 2 is devoted to a brief review of strong stationary duality and a proof that
any discrete-time birth-and-death kernel with positive eigenvalues satisfies (1.1). In
Section 3 we construct P from $P^*$. In Section 4 we construct $\hat{P}$ from P and (in
Section 4.1) describe the sample-path construction discussed in Remark 1.4(b). In
Section 5 for completeness we provide continuous-time analogs of our discrete-time
auxiliary results, which we find interesting in their own right and which combine to give
a direct stochastic proof of Theorem 1.1. Section 6 shows how to extend Theorem 1.1
from the hitting time of state d to the occupation-time vector for the states $\{0, \dots, d-1\}$
and connects the present paper with work of Kent [13] and the celebrated Ray-Knight
theorem [17, 14].

2 A quick review of strong stationary duality 

The main purpose of this background section is to review the theory of strong stationary 
duality only to the extent necessary to understand the proof of Theorem 1.2. For a
more general and more detailed treatment, consult [3], especially Sections 2-4. To a 
reasonable extent, the notation of this paper matches that of [4]. Strong stationary 
duality has been used to bound mixing times of Markov chains and also to build perfect 
simulation algorithms [8]. 

2.1 Strong stationary duality in general 

Let X be an ergodic (irreducible and aperiodic) Markov chain on a finite state space;
call its stationary distribution $\pi$. A strong stationary time is a randomized stopping
time T for X such that $X_T$ has the distribution $\pi$ and is independent of T. Aldous
and Diaconis [1, Proposition 3.2] prove that for any such X there exists a fastest (i.e.,
stochastically minimal) strong stationary time, although it is well known that such a 
fastest time is not (generally) unique. (Such a fastest time is called a time to stationarity 
in [4], but this terminology has not been widely adopted and so will not be used here.) 

A systematic approach to building strong stationary times is provided by the frame- 
work of strong stationary duality. The following specialization of the treatment in 
Section 2 of [1] (see especially Theorem 2.17 and Remark 2.39 there) is sufficient for our 
purposes. 

Theorem 2.1. Let $\pi_0$ and $\pi_0^*$ be probability mass functions on $S = \{0, 1, \dots, d\}$, regarded as
row vectors, and let P, $P^*$, and $\Lambda$ be transition matrices on S. Assume that P is ergodic
with stationary distribution $\pi$, that state d is absorbing for $P^*$, and that the row $\Lambda(d, \cdot)$
equals $\pi$. If $(\pi_0^*, P^*)$ is a strong stationary dual of $(\pi_0, P)$ with respect to the link $\Lambda$ in
the sense that

$$ \pi_0 = \pi_0^* \Lambda \quad \text{and} \quad \Lambda P = P^* \Lambda, \tag{2.1} $$

then there exists a bivariate Markov chain $(X^*, X)$ such that

(a) X is marginally Markov with initial distribution $\pi_0$ and transition matrix P;

(b) $X^*$ is marginally Markov with initial distribution $\pi_0^*$ and transition matrix $P^*$;

(c) the absorption time $T^*$ of $X^*$ is a strong stationary time for X.

Moreover, if $\Lambda(i, d) = 0$ for $i = 0, \dots, d-1$, then the dual is sharp in the sense that $T^*$
is a fastest strong stationary time for X.

Remark 2.2. In both our applications of Theorem 2.1 (Sections 3 and 4):

(i) the initial distributions $\pi_0$ and $\pi_0^*$ are both taken to be unit mass $\delta_0$ at 0, and
$\Lambda(0, \cdot) = \delta_0$, too, so only the second equation in (2.1) needs to be checked; and

(ii) the link $\Lambda$ is lower triangular, from which we observe that the corresponding dual
is sharp and (if also the diagonal elements of $\Lambda$ are all positive) that, given P,
there is at most one stochastic matrix $P^*$ satisfying (2.1), namely, $P^* = \Lambda P \Lambda^{-1}$.

2.2 Classical (set-valued) strong stationary duals 

Let P be ergodic with stationary distribution $\pi$, and let H denote the corresponding
cumulative distribution function (cdf):

$$ H_j := \sum_{i \le j} \pi_i, \qquad j = 0, \dots, d. $$

Let $\Lambda$ be the link of truncated stationary distributions:

$$ \Lambda(x^*, x) = \mathbf{1}(x \le x^*)\, \pi_x / H_{x^*}. \tag{2.2} $$

If P is a monotone birth-and-death chain (more generally, if P is arbitrary and the time
reversal of P is monotone; see [U Theorem 4.6]), then a dual $P^*$ exists [and is sharp
and unique by Remark 2.2(ii)]:

Theorem 2.3. Let P be a monotone ergodic birth-and-death chain on $\{0, \dots, d\}$ with
stationary cdf H. Then P has a sharp strong stationary dual $P^*$ with respect to the link
of truncated stationary distributions. The chain $P^*$ is also birth-and-death, with death,
hold, and birth probabilities (respectively)

$$ q_i^* = \frac{H_{i-1}}{H_i}\, p_i, \qquad r_i^* = 1 - (p_i + q_{i+1}), \qquad p_i^* = \frac{H_{i+1}}{H_i}\, q_{i+1}. \tag{2.3} $$

See Sections 3-4 of [1] for an explanation as to why the dual in Theorem 2.3 is called
"set-valued"; in this paper we shall refer to it as the "classical" SSD. The equations (2.3)
reproduce [U (4.18)].
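The formulas (2.2) and (2.3) are easy to exercise on a concrete chain. The sketch below is an illustration only: it uses the lazy Ehrenfest chain (which is monotone) as a test case, computes the link of truncated stationary distributions and the classical dual in exact rational arithmetic, and leaves the intertwining relation $\Lambda P = P^* \Lambda$ from (2.1) to be compared directly. The helper names are hypothetical.

```python
from fractions import Fraction
from math import comb

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def ehrenfest(d):
    """Lazy Ehrenfest kernel on {0,...,d}: q_i = i/(2d), r_i = 1/2, p_i = (d-i)/(2d)."""
    P = [[Fraction(0)] * (d + 1) for _ in range(d + 1)]
    for i in range(d + 1):
        P[i][i] = Fraction(1, 2)
        if i > 0:
            P[i][i - 1] = Fraction(i, 2 * d)
        if i < d:
            P[i][i + 1] = Fraction(d - i, 2 * d)
    return P

def classical_dual(P, pi):
    """Return (dual kernel from (2.3), link from (2.2)) for a birth-and-death kernel P."""
    d = len(P) - 1
    H, total = [], Fraction(0)
    for mass in pi:
        total += mass
        H.append(total)                      # stationary cdf
    link = [[pi[x] / H[xs] if x <= xs else Fraction(0) for x in range(d + 1)]
            for xs in range(d + 1)]
    dual = [[Fraction(0)] * (d + 1) for _ in range(d + 1)]
    for i in range(d + 1):
        p_i = P[i][i + 1] if i < d else Fraction(0)
        q_next = P[i + 1][i] if i < d else Fraction(0)
        if i > 0:
            dual[i][i - 1] = H[i - 1] / H[i] * p_i       # death probability q*_i
        if i < d:
            dual[i][i + 1] = H[i + 1] / H[i] * q_next    # birth probability p*_i
        dual[i][i] = 1 - (p_i + q_next)                  # holding probability r*_i
    return dual, link

d = 3
P = ehrenfest(d)
pi = [Fraction(comb(d, i), 2 ** d) for i in range(d + 1)]   # Binomial(d, 1/2)
P_star, link = classical_dual(P, pi)
```

Comparing `mat_mul(link, P)` with `mat_mul(P_star, link)` confirms the second equation in (2.1) for this example, and the last row of `P_star` shows that state d is absorbing.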

2.3 Positivity of eigenvalues and stochastic monotonicity for birth- 
and-death chains 

When we prove Theorem 1.2 assuming that $P^*$ has positive eigenvalues, we will utilize
the strengthened monotonicity condition (1.1). Part (a) of the following lemma provides
justification.

Lemma 2.4. Let $P^*$ be the kernel of any birth-and-death chain on $\{0, \dots, d\}$.

(a) If $P^*$ has positive eigenvalues, then (1.1) holds.

(b) If $P^*$ has nonnegative eigenvalues, then $P^*$ is monotone.

Proof. (a) By perturbing $P^*$ if necessary, we may assume that $P^*$ is ergodic. Then $P^*$
is diagonally similar to a positive definite matrix whose principal minor corresponding
to rows and columns $i - 1$ and $i$ is $r^*_{i-1} r^*_i - p^*_{i-1} q^*_i$, so

$$ 0 < r^*_{i-1} r^*_i - p^*_{i-1} q^*_i \le (1 - p^*_{i-1})(1 - q^*_i) - p^*_{i-1} q^*_i = 1 - p^*_{i-1} - q^*_i. $$

(b) This follows by perturbation from part (a). □
Remark 2.5. Both converse statements are false. For any given $d \ge 2$, the
condition (1.1) does not imply nonnegativity of eigenvalues, not even for chains $P^*$ satisfying
the hypotheses of Theorem 1.2. An explicit counterexample for d = 2 is

$$ P^* = \begin{pmatrix} 0.50 & 0.50 & 0 \\ 0.49 & 0.02 & 0.49 \\ 0 & 0 & 1 \end{pmatrix}, $$

whose smallest eigenvalue is $(26 - \sqrt{3026})/100 < 0$. For general $d > 2$, perturb the
direct sum of this counterexample with the identity matrix.
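The d = 2 counterexample of Remark 2.5 can be checked by hand or with a few lines of code. In the sketch below (helper names are hypothetical), the matrix is block upper triangular, so its eigenvalues are 1 together with the two roots of the characteristic polynomial of the upper-left 2-by-2 block.

```python
import math

# The kernel of Remark 2.5 on {0, 1, 2}; state 2 is absorbing.
P_star = [[0.50, 0.50, 0.00],
          [0.49, 0.02, 0.49],
          [0.00, 0.00, 1.00]]

def satisfies_condition_1_1(kernel):
    """Check p*_{i-1} + q*_i < 1 for 1 <= i <= d."""
    d = len(kernel) - 1
    return all(kernel[i - 1][i] + kernel[i][i - 1] < 1 for i in range(1, d + 1))

def char_poly_block(x):
    """Characteristic polynomial of the upper-left 2x2 block, evaluated at x."""
    a, b = P_star[0][0], P_star[0][1]
    c, e = P_star[1][0], P_star[1][1]
    return x * x - (a + e) * x + (a * e - b * c)

smallest = (26 - math.sqrt(3026)) / 100   # the value quoted in the remark
```

Here `char_poly_block(smallest)` vanishes up to rounding error, while `satisfies_condition_1_1(P_star)` holds, so (1.1) is satisfied even though one eigenvalue is negative.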

3 An anti-dual P of the given P* 

As discussed in Section 1, the main discrete-time theorem, Theorem 1.2, follows from
the chief results, Theorems 3.1 and 4.2, of this section and the next.

Under the strengthened monotonicity condition (1.1) (with no assumption here
about nonnegativity of the eigenvalues), the anti-dual construction of Theorem 3.1
exhibits the given chain (call its kernel $P^*$) as the classical SSD of another birth-and-death
chain.

Theorem 3.1. Consider a discrete-time birth-and-death chain $P^*$ on $\{0, \dots, d\}$ started
at 0, and suppose that d is an absorbing state. Write $q_i^*$, $r_i^*$, and $p_i^*$ for its death, hold,
and birth probabilities, respectively. Suppose that $p_i^* > 0$ for $0 \le i \le d-1$, that $q_i^* > 0$
for $1 \le i \le d-1$, and that $p^*_{i-1} + q^*_i < 1$ for $1 \le i \le d$. Then $P^*$ is the classical (and
hence sharp) SSD of some monotone ergodic birth-and-death kernel P on $\{0, \dots, d\}$.

Proof. In light of Remark 2.2(i), we have dispensed with initial distributions. The claim
is that $P^*$ is related to some monotone ergodic P with stationary cdf H via (2.3). We
will begin our proof by defining a suitable function H, and then we will construct P.

We inductively define a strictly increasing function $H : \{0, \dots, d\} \to (0, 1]$. Let
$H_d := 1$, and define $H_{d-1} \in (0, 1)$ in (for now) arbitrary fashion. Having defined
$H_d, \dots, H_i$ (for some $1 \le i \le d-1$), choose the value of $H_{i-1} \in (0, H_i)$ so that

$$ \left( \frac{H_i}{H_{i-1}} - 1 \right) q_i^* = \left( 1 - \frac{H_i}{H_{i+1}} \right) p_i^*; \tag{3.1} $$

this is clearly possible since the right side of (3.1) is in (0, 1) and the left side, as a
function of the variable $H_{i-1}$, decreases from $\infty$ at $H_{i-1} = 0+$ to 0 at $H_{i-1} = H_i$. It
is also clear that by choosing $H_{d-1}$ sufficiently close to 1, we can make all the ratios
$H_i / H_{i-1}$ ($i = 1, \dots, d$) as (uniformly) close to 1 as we wish.
Next, define $q_0 := 0$,

$$ p_0 := \left( 1 - \frac{H_0}{H_1} \right) p_0^*, \tag{3.2} $$

and, for $1 \le i \le d$,

$$ p_i := \frac{H_i}{H_{i-1}}\, q_i^*, \qquad q_i := \frac{H_{i-1}}{H_i}\, p_{i-1}^*. \tag{3.3} $$
When the H-ratios are taken close enough to 1, then for $0 \le i \le d$ we have $p_i + q_i < 1$
and we define

$$ r_i := 1 - p_i - q_i > 0. $$

The kernel P with death, hold, and birth probabilities $q_i$, $r_i$, and $p_i$ is irreducible and
aperiodic, and thus ergodic. To complete the proof, we will also show

(a) P is monotone (recall: equivalent to $p_i + q_{i+1} \le 1$ for $0 \le i \le d-1$),

(b) P has stationary cdf H, and

(c) $P^*$ is the classical SSD of P.

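For (a) we simply observe, using (3.3) and (3.1), that

$$ p_i + q_{i+1} = \frac{H_i}{H_{i-1}}\, q_i^* + \frac{H_i}{H_{i+1}}\, p_i^* = q_i^* + p_i^* \le 1 \tag{3.4} $$

for $1 \le i \le d-1$; and similarly that

$$ p_0 + q_1 = \left( 1 - \frac{H_0}{H_1} \right) p_0^* + \frac{H_0}{H_1}\, p_0^* = p_0^* \le 1. $$

For (b) we observe, again using (3.3) and (3.1), that the detailed balance condition

$$ (H_i - H_{i-1})\, p_i = (H_i - H_{i-1}) \frac{H_i}{H_{i-1}}\, q_i^* = (H_{i+1} - H_i) \frac{H_i}{H_{i+1}}\, p_i^* = (H_{i+1} - H_i)\, q_{i+1} $$

holds for $1 \le i \le d-1$; by (3.2) and (3.3), it also holds for i = 0:

$$ H_0\, p_0 = (H_1 - H_0) \frac{H_0}{H_1}\, p_0^* = (H_1 - H_0)\, q_1. $$

For (c), we simply verify that (2.3) holds: for $0 \le i \le d$ (with $H_{-1} := 0$), from (3.3)
and (3.4),

$$ \frac{H_{i-1}}{H_i}\, p_i = q_i^*, \qquad \frac{H_{i+1}}{H_i}\, q_{i+1} = p_i^*, \qquad p_i + q_{i+1} = q_i^* + p_i^* = 1 - r_i^*. \qquad \Box \tag{3.5} $$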
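The anti-dual construction in the proof can be traced on a small example. The sketch below rests on an assumption: it takes the defining relation for $H_{i-1}$ to be $(H_i/H_{i-1} - 1)\, q_i^* = (1 - H_i/H_{i+1})\, p_i^*$, which is the form consistent with the detailed-balance computation in the proof, and solves it in closed form. The kernel $P^*$ and the value chosen for $H_{d-1}$ are hypothetical.

```python
from fractions import Fraction

def anti_dual(p_star, q_star, h_next_to_last):
    """Build H and the birth/death probabilities of P from those of P*.

    p_star[i], q_star[i] are indexed by i = 0..d, with q_star[0] = 0 and
    p_star[d] = q_star[d] = 0 (state d is absorbing for P*).
    """
    d = len(p_star) - 1
    H = [None] * (d + 1)
    H[d] = Fraction(1)
    H[d - 1] = h_next_to_last
    for i in range(d - 1, 0, -1):
        # Solve (H_i/H_{i-1} - 1) q*_i = (1 - H_i/H_{i+1}) p*_i for H_{i-1}.
        ratio = 1 + p_star[i] / q_star[i] * (1 - H[i] / H[i + 1])
        H[i - 1] = H[i] / ratio
    p = [Fraction(0)] * (d + 1)
    q = [Fraction(0)] * (d + 1)
    p[0] = (1 - H[0] / H[1]) * p_star[0]
    for i in range(1, d + 1):
        p[i] = H[i] / H[i - 1] * q_star[i]
        q[i] = H[i - 1] / H[i] * p_star[i - 1]
    return H, p, q

def dual_rates(H, p, q):
    """Recover (p*, q*) from (H, p, q) via the classical-dual formulas (2.3)."""
    d = len(H) - 1
    p_back = [H[i + 1] / H[i] * q[i + 1] if i < d else Fraction(0) for i in range(d + 1)]
    q_back = [H[i - 1] / H[i] * p[i] if i > 0 else Fraction(0) for i in range(d + 1)]
    return p_back, q_back

# Hypothetical P* on {0, 1, 2}: p*_0 = 1/2, q*_1 = p*_1 = 1/4, state 2 absorbing.
p_star = [Fraction(1, 2), Fraction(1, 4), Fraction(0)]
q_star = [Fraction(0), Fraction(1, 4), Fraction(0)]
H, p, q = anti_dual(p_star, q_star, Fraction(9, 10))
p_back, q_back = dual_rates(H, p, q)
```

For this input the construction returns H = (9/11, 9/10, 1), and applying (2.3) to the resulting chain gives back the original birth and death probabilities of $P^*$.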



Remark 3.2. Once the value of $H_{d-1}$ is chosen, the definitions of H and P are forced;
indeed, if the detailed balance condition and (3.5) are to hold, then we must have (3.1)-(3.3).



4 A pure birth "spectral" dual of P 

In this section we construct a sharp pure birth dual $\hat{P}$ for any ergodic birth-and-death
chain P on $\{0, \dots, d\}$ with nonnegative eigenvalues started in state 0. When this
construction is applied in the proof of Theorem 1.2 to the chain P resulting from $P^*$ by
application of Theorem 3.1, assuming nonnegativity of the eigenvalues of $P^*$ yields the
required nonnegativity of the eigenvalues of P in Theorem 4.2; indeed, as noted in
Remark 2.2(ii), the matrices P and $P^*$ are similar.






Our construction of the pure birth dual specializes a SSD construction of Matthews
[15] for general reversible chains with nonnegative eigenvalues; that construction is
closely related to the spectral decomposition of the transition matrix. For completeness
and the reader's convenience, and because for birth-and-death chains (a) we can give a
more streamlined presentation with minimal reference to eigenvectors and (b) we wish
to establish the new result that the resulting dual is sharp, we do not presume familiarity
with [15].

To set up our construction we need some notation. Let P be an ergodic
birth-and-death chain on $\{0, \dots, d\}$ with stationary probability mass function $\pi$ (note that $\pi$ is
everywhere positive) and nonnegative eigenvalues, say $0 \le \theta_0 \le \theta_1 \le \cdots \le \theta_{d-1} < \theta_d = 1$.
(It is well known [12] [U Theorem 4.20] that the eigenvalues are all distinct, but we
will not need this fact.) Let I denote the identity matrix and define

$$ Q_k := (1 - \theta_0)^{-1} \cdots (1 - \theta_{k-1})^{-1} (P - \theta_0 I) \cdots (P - \theta_{k-1} I), \qquad k = 0, \dots, d, \tag{4.1} $$

with the natural convention $Q_0 := I$. Note that for $k = 0, \dots, d-1$ we have

$$ Q_k P = \theta_k Q_k + (1 - \theta_k) Q_{k+1}. \tag{4.2} $$
Lemma 4.1. The matrices $Q_k$ are all stochastic, and every row of $Q_d$ equals $\pi$.

Proof. For the first assertion it is clear that the rows of $Q_k$ all sum to 1, so the only
question is whether $Q_k$ is nonnegative. But $P = D^{-1/2} S D^{1/2}$, where $D = \mathrm{diag}(\pi)$ and S
is symmetric, so the nonnegativity of $Q_k$ follows from that of

$$ S_k := (S - \theta_0 I) \cdots (S - \theta_{k-1} I), $$

which in turn is an immediate consequence of (the rather nontrivial) Theorem 3.2 in [16]
using only that S is nonnegative and symmetric.
For the second assertion, write

$$ S = \sum_{r=0}^{d} \theta_r u_r u_r^T, $$

where the column vectors $u_0, \dots, u_d$ form an orthogonal matrix and $u_d$ has ith entry $\sqrt{\pi_i}$.
Then, as noted at (2.6) of [16],

$$ S_k = \sum_{r=k}^{d} \left[ \prod_{t=0}^{k-1} (\theta_r - \theta_t) \right] u_r u_r^T. $$

In particular, $S_d = (1 - \theta_0) \cdots (1 - \theta_{d-1})\, u_d u_d^T$, so every row of $Q_d$ equals $\pi$. □

Now let $\delta_0$ denote unit mass at 0 (regarded as a row vector), and define the
probability mass functions

$$ \hat{\lambda}_k := \delta_0 Q_k, \qquad k = 0, \dots, d. \tag{4.3} $$

Let $\hat{\Lambda}$ [so named to distinguish it from the classic link $\Lambda$ of (2.2)] be the lower-triangular
square matrix with successive rows $\hat{\lambda}_0, \dots, \hat{\lambda}_d$, and define $\hat{P}$ to be the pure-birth chain
transition matrix on $\{0, \dots, d\}$ with holding probability $\theta_i$ at state i for $i = 0, \dots, d$;
that is,

$$ \hat{p}_{ij} := \begin{cases} \theta_i & \text{if } j = i \\ 1 - \theta_i & \text{if } j = i + 1 \\ 0 & \text{otherwise.} \end{cases} \tag{4.4} $$

Theorem 4.2. Let P be an ergodic birth-and-death chain on $\{0, \dots, d\}$ with nonnegative
eigenvalues. In the above notation, $\hat{P}$ is a sharp strong stationary dual of P with respect
to the link $\hat{\Lambda}$.

Proof. We have again dispensed with initial distributions by Remark 2.2(i). The desired
equation $\hat{\Lambda} P = \hat{P} \hat{\Lambda}$ is equivalent to

$$ \hat{\lambda}_k P = \theta_k \hat{\lambda}_k + (1 - \theta_k) \hat{\lambda}_{k+1}, \quad k = 0, \dots, d-1; \qquad \hat{\lambda}_d P = \hat{\lambda}_d, $$

which is true because $\hat{\lambda}_d = \pi$ and, for $k = 0, \dots, d-1$,

$$ \hat{\lambda}_k P = \delta_0 Q_k P = \theta_k \hat{\lambda}_k + (1 - \theta_k) \hat{\lambda}_{k+1} $$

by (4.2). The SSD is sharp because $\hat{\Lambda}$ is lower triangular; recall Remark 2.2(ii). □
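Lemma 4.1 and Theorem 4.2 can be checked in exact arithmetic on a chain whose eigenvalues are known in closed form. The sketch below is illustrative only: it uses the lazy Ehrenfest chain with d = 4, whose eigenvalues are i/d, builds the matrices $Q_k$ recursively from (4.1), and assembles the link and the pure-birth kernel. All function and variable names are hypothetical.

```python
from fractions import Fraction
from math import comb

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def ehrenfest(d):
    """Lazy Ehrenfest kernel: q_i = i/(2d), r_i = 1/2, p_i = (d-i)/(2d)."""
    P = [[Fraction(0)] * (d + 1) for _ in range(d + 1)]
    for i in range(d + 1):
        P[i][i] = Fraction(1, 2)
        if i > 0:
            P[i][i - 1] = Fraction(i, 2 * d)
        if i < d:
            P[i][i + 1] = Fraction(d - i, 2 * d)
    return P

def spectral_dual(P, thetas):
    """Return ([Q_0..Q_d], link, P_hat) for eigenvalues thetas[0] <= ... <= thetas[d] = 1."""
    n = len(P)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    Q = [identity]
    for k in range(n - 1):
        shifted = [[P[i][j] - thetas[k] * identity[i][j] for j in range(n)] for i in range(n)]
        # Q_{k+1} = (1 - theta_k)^{-1} Q_k (P - theta_k I)
        Q.append([[x / (1 - thetas[k]) for x in row] for row in mat_mul(Q[k], shifted)])
    link = [Q[k][0][:] for k in range(n)]            # row k is delta_0 Q_k
    P_hat = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        P_hat[i][i] = thetas[i]                      # holding probability theta_i
        if i + 1 < n:
            P_hat[i][i + 1] = 1 - thetas[i]          # birth probability
    return Q, link, P_hat

d = 4
P = ehrenfest(d)
thetas = [Fraction(i, d) for i in range(d + 1)]
Q, link, P_hat = spectral_dual(P, thetas)
pi = [Fraction(comb(d, i), 2 ** d) for i in range(d + 1)]
```

With these objects one can compare `mat_mul(link, P)` against `mat_mul(P_hat, link)`, inspect the signs and row sums of each `Q[k]`, and confirm that every row of `Q[d]` equals `pi`.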

Remark 4.3. Lemma 4.1 is interesting and, as we have now seen, gives rise to the
construction of a new "spectral" SSD for a certain subclass of monotone birth-and-death
chains, namely, chains with nonnegative eigenvalues [recall Lemma 2.4(b)]. But for the
proof of Theorem 1.2 one could make do without the nonnegativity of the matrix $\hat{\Lambda}$ by
taking the approach of Matthews [15] and considering the chain P started in a suitable
mixture of $\delta_0$ and the stationary distribution $\pi$. We omit further details.

4.1 Sample-path construction of the spectral dual 

Let X be an ergodic birth-and-death chain on $\{0, \dots, d\}$ with kernel P having
nonnegative eigenvalues, assume $X_0 = 0$, and let T be any fastest strong stationary time for X.
Independent of interest in Theorems 1.1 and 1.2, Theorem 4.2 gives the first stochastic
interpretation of the individual geometrics in the representation of the distribution of T
as a convolution of geometric distributions. In this subsection we carry this result one
step further by showing how to construct, sample path by sample path, a particular
fastest strong stationary time $\hat{T}$ which is the sum of explicitly identified independent
geometric random variables.

The idea is simple. Theorem 4.2 shows that $\hat{P}$ of (4.4) is an "algebraic" dual of P
in the sense that the matrix equation $\hat{\Lambda} P = \hat{P} \hat{\Lambda}$ holds. But whenever algebraic duality
holds for any finite-state ergodic chain with respect to any link ($\hat{\Lambda}$ in our case),
Section 2.4 of [4] shows explicitly how to construct, from X and independent randomness, a
dual Markov chain ($\hat{X}$ in our case, with kernel $\hat{P}$) such that the absorption time $\hat{T}$ of $\hat{X}$
is a strong stationary time for X; since $\hat{\Lambda}$ is lower triangular, $\hat{T}$ will be stochastically
optimal. So to describe our construction of $\hat{X}$ (and hence $\hat{T}$) we need only specialize
the construction of [U Section 2.4] [see especially (2.36) there].
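The chain X starts with $X_0 = 0$ and we set $\hat{X}_0 = 0$. Inductively, we will have
$\hat{\Lambda}(\hat{X}_t, X_t) > 0$ (and so $X_t \le \hat{X}_t$) at all times t. The value we construct for $\hat{X}_t$ depends
only on the values of $\hat{X}_{t-1}$ and $X_t$ and independent randomness. Indeed, given $\hat{X}_{t-1} = \hat{x}$
and $X_t = y$, if $y \le \hat{x}$ then our construction sets $\hat{X}_t = \hat{x} + 1$ with probability

$$ \frac{\hat{P}(\hat{x}, \hat{x} + 1)\, \hat{\Lambda}(\hat{x} + 1, y)}{(\hat{P} \hat{\Lambda})(\hat{x}, y)} = \frac{(1 - \theta_{\hat{x}})\, \hat{\Lambda}(\hat{x} + 1, y)}{\theta_{\hat{x}}\, \hat{\Lambda}(\hat{x}, y) + (1 - \theta_{\hat{x}})\, \hat{\Lambda}(\hat{x} + 1, y)} = \frac{(1 - \theta_{\hat{x}})\, Q_{\hat{x}+1}(0, y)}{(Q_{\hat{x}} P)(0, y)} \tag{4.5} $$

and $\hat{X}_t = \hat{x}$ with the complementary probability; if $y = \hat{x} + 1$ (which is the only other
possibility, since $y = X_t \le X_{t-1} + 1 \le \hat{x} + 1$ by induction), then we set $\hat{X}_t = \hat{x} + 1$ with
certainty.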

The independent geometric random variables, with sum $\hat{T}$, are the waiting times
between successive births in the chain $\hat{X}$ we have built. Thus it is no longer true
that the individual geometric distributions "have no known interpretation in terms of
the underlying [ergodic] birth and death chain" [6, Section 4, Remark 1]; likewise, for
continuous time consult Section 5.1 herein.



Example 4.4. Consider the well-studied Ehrenfest chain, with holding probability 1/2:

$$ q_i = \frac{i}{2d}, \qquad r_i = \frac{1}{2}, \qquad p_i = \frac{d - i}{2d}, \qquad i = 0, \dots, d. $$

The eigenvalues are $\theta_i = i/d$. A straightforward proof by induction using (4.3) and (4.2)
confirms that $\hat{\lambda}_k$ is the binomial distribution with parameters k and 1/2:

$$ \hat{\Lambda}(\hat{x}, x) = \binom{\hat{x}}{x} 2^{-\hat{x}}. \tag{4.6} $$

Thus the probability (4.5) reduces to

$$ \frac{(d - \hat{x})(\hat{x} + 1)}{2\hat{x}(\hat{x} + 1 - y) + (d - \hat{x})(\hat{x} + 1)}. $$

The chain we have described lifts naturally to random walk on the set $\mathbb{Z}_2^d$ of binary
d-tuples whereby one of the d coordinates is chosen uniformly at random and its entry
is then replaced randomly by 0 or 1. It is interesting to note that the sharp pure-birth
SSD chain constructed in this example does not correspond to the well-known
"coordinate-checking" sharp SSD (see Example 3.2 of [1]). Indeed, expressed in the
birth-and-death chain domain, the coordinate-checking dual is a pure-birth chain, call
it $X'$, such that the construction of $X'_t$ depends not only on $X'_{t-1}$ and $X_t$ but also
on $X_{t-1}$. The construction rules are that if $X'_{t-1} = \hat{x}$, $X_{t-1} = x$, and $X_t = y$, then $X'_t$
is set to $\hat{x} + 1$ with probability

$$ 0 \text{ if } y = x - 1, \qquad 1 - \frac{\hat{x}}{d} \text{ if } y = x, \qquad \text{and} \quad \frac{d - \hat{x}}{d - x} \text{ if } y = x + 1, $$

and otherwise $X'_t$ holds at $\hat{x}$. Both duals correspond to the same link (4.6) and the
(marginal) transition kernels for $\hat{X}$ and $X'$ are the same, but the bivariate constructions
of $(\hat{X}, X)$ and $(X', X)$ are different.
The freedom for such differences was noted in [H Remark 2.23(c)] and exploited in
the creation of an interruptible perfect simulation algorithm (see [8, Remark 9.8]). In
fact, $X'$ (when lifted to $\mathbb{Z}_2^d$) corresponds to the construction used in [8]. An advantage of
the $\hat{X}$-construction of the present paper is that it allows (both in our Ehrenfest example
and in general) for holding probabilities that are arbitrary (subject to nonnegativity of
eigenvalues); in the paragraph containing (4.5), all that changes when a weighted average
of the transition kernel and the identity matrix is taken are the eigenvalues $\theta_0, \dots, \theta_{d-1}$.
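The sample-path construction of Section 4.1 is short enough to simulate for the Ehrenfest chain of Example 4.4. The sketch below is a hedged illustration, not the construction of [4] in full generality: it hard-codes the reduced form of the probability (4.5) for the Ehrenfest chain, the function names are hypothetical, and the random number generator is seeded only so that runs are reproducible.

```python
import random
from fractions import Fraction

def birth_probability(d, x_hat, y):
    """Probability (4.5) specialised to the Ehrenfest chain, valid for y <= x_hat < d."""
    num = (d - x_hat) * (x_hat + 1)
    return Fraction(num, 2 * x_hat * (x_hat + 1 - y) + num)

def ehrenfest_step(d, x, rng):
    """One step of the lazy Ehrenfest chain from state x."""
    u = rng.random()
    if u < x / (2 * d):
        return x - 1
    if u < x / (2 * d) + (d - x) / (2 * d):
        return x + 1
    return x

def run(d, rng):
    """Run X and the pure-birth dual together until the dual is absorbed at d."""
    x, x_hat, path = 0, 0, [(0, 0)]
    while x_hat < d:
        y = ehrenfest_step(d, x, rng)
        # Forced birth when X jumps above the dual; otherwise a randomised birth.
        if y == x_hat + 1 or rng.random() < birth_probability(d, x_hat, y):
            x_hat += 1
        x = y
        path.append((x_hat, x))
    return path

path = run(5, random.Random(0))          # list of (dual state, chain state) pairs
absorption_time = len(path) - 1          # the strong stationary time for this run
```

Along any run the chain stays at or below the dual, and the waiting times between successive increases of the first coordinate are the geometric summands discussed above.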

5 Continuous-time analogs of other results

As discussed in Section 1, the continuous-time Theorem 1.1 follows immediately from
the discrete-time Theorem 1.2. Another way to prove Theorem 1.1 is to repeat the proof
of Theorem 1.2 by establishing continuous-time analogs (namely, the next three results)
of the auxiliary results (Theorem 3.1, Lemma 4.1, and Theorem 4.2) in the preceding
two sections; we find these interesting in their own right. The continuous-time results
are easy to prove utilizing the continuous-time SSD theory of [7], either by repeating the
discrete-time proofs or by applying the discrete-time results to the appropriate kernel
$P^*(\varepsilon) = I + \varepsilon G^*$ or $P(\varepsilon) = I + \varepsilon G$, with $\varepsilon > 0$ chosen sufficiently small to meet the
hypotheses of those results; so we state the results without proof.

In Section 5.1 we will present the analog of Section 4.1 for continuous time.

Here, first, is the analog of Theorem 3.1.

Theorem 5.1. Consider a continuous-time birth-and-death chain with generator $G^*$ on
$\{0, \dots, d\}$ started at 0, and suppose that d is an absorbing state. Write $\mu_i^*$ and $\lambda_i^*$ for
its death and birth rates, respectively. Suppose that $\lambda_i^* > 0$ for $0 \le i \le d-1$ and that
$\mu_i^* > 0$ for $1 \le i \le d-1$. Then $G^*$ is the classical set-valued (and hence sharp) SSD of
some ergodic birth-and-death generator G on $\{0, \dots, d\}$.

To set up the second result we need a little notation. Let G be the generator of a
continuous-time ergodic birth-and-death chain on $\{0, \dots, d\}$ with stationary probability
mass function $\pi$ and eigenvalues $\nu_0 \ge \nu_1 \ge \cdots \ge \nu_{d-1} > \nu_d = 0$ for $-G$. (Again, we
don't need the fact [12] that the eigenvalues are distinct.) Define

$$ Q_k := \nu_0^{-1} \cdots \nu_{k-1}^{-1} (G + \nu_0 I) \cdots (G + \nu_{k-1} I), \qquad k = 0, \dots, d, \tag{5.1} $$

with the natural convention $Q_0 := I$.

Lemma 5.2. The matrices $Q_k$ are all stochastic, and every row of $Q_d$ equals $\pi$.

Now define $\hat{\Lambda}$ in terms of the $Q_k$'s as in the paragraph preceding Theorem 4.2,
and let $\hat{G}$ be the pure-birth generator on $\{0, \dots, d\}$ with birth rate $\nu_i$ at state i for
$i = 0, \dots, d$.

Theorem 5.3. Let G be the generator of an ergodic birth-and-death chain on $\{0, \dots, d\}$.
In the above notation, $\hat{G}$ is a sharp strong stationary dual of G with respect to the link $\hat{\Lambda}$:

$$ \hat{\Lambda} G = \hat{G} \hat{\Lambda}. $$
5.1 Sample-path construction of the continuous-time spectral dual

Let X be an ergodic continuous-time birth-and-death chain on $\{0, \dots, d\}$, adopt all the
notation of Section 5 thus far, and assume $X(0) = 0$. In this subsection by a routine
application of Section 2.3 of [7] we give a simple sample-path construction of a "spectral
dual" pure birth chain $\hat{X}$ with generator $\hat{G}$ as described just before Theorem 5.3; its
absorption time $\hat{T}$ is then a fastest strong stationary time for X and the independent
exponential random variables with sum $\hat{T}$ are simply the waiting times for the successive
births for $\hat{X}$. We thus obtain a stochastic proof, with explicit identification of individual
exponential random variables, of Theorem 5 in [7].

The chain X starts with $X(0) = 0$ and we set $\hat{X}(0) = 0$. Let $n \ge 1$ and suppose
that $\hat{X}$ has been constructed up through the epoch $\tau_{n-1}$ of the $(n-1)$st transition for
the bivariate process $(\hat{X}, X)$; here $\tau_0 := 0$. We describe next, in terms of an exponential
random variable $V_n$, how to define $\tau_n$ and $\hat{X}(\tau_n)$; we will have $\hat{\Lambda}(\hat{X}(\tau_n), X(\tau_n)) > 0$ and
hence $X(\tau_n) \le \hat{X}(\tau_n)$. Write $(\hat{x}, x)$ for the value of $(\hat{X}, X)$ at time $\tau_{n-1}$; by induction
we have $\hat{\Lambda}(\hat{x}, x) > 0$.

Let $V_n$ be exponentially distributed with rate

$$ r = \nu_{\hat{x}}\, \hat{\Lambda}(\hat{x} + 1, x) / \hat{\Lambda}(\hat{x}, x), \tag{5.2} $$

independent of $V_1, \dots, V_{n-1}$ and the chain X. Consider two (independent) exponential
waiting times begun at epoch $\tau_{n-1}$: a first for the next transition of the chain X, and a
second with rate r. How we proceed breaks into two cases:

(i) If the first waiting time is smaller than the second, then $\tau_n$ is the epoch of this
next transition for X and we set $\hat{X}(\tau_n) = \hat{x} = \hat{X}(\tau_{n-1})$ (with certainty) except in
one circumstance: if $X(\tau_n) = \hat{x} + 1$, then we set $\hat{X}(\tau_n) = \hat{x} + 1$, too.

(ii) If the second waiting time is smaller, then $\tau_n = \tau_{n-1} + V_n$ and we set $\hat{X}(\tau_n) = \hat{x} + 1$.

Example 5.4. Consider the continuous-time version of the Ehrenfest chain with death
rates $\mu_i = i$ and birth rates $\lambda_i = d - i$, $0 \le i \le d$; the eigenvalues are $\nu_i = 2(d - i)$.
Then $\hat{\Lambda}$ is again the link (4.6) of binomial distributions, and the rate (5.2) reduces to

$$ r = \frac{(d - \hat{x})(\hat{x} + 1)}{\hat{x} + 1 - x}. $$
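The two-clock construction of Section 5.1 can likewise be simulated for the continuous-time Ehrenfest chain. This is a minimal sketch under stated assumptions: it uses the reduced rate from Example 5.4, it redraws both exponential clocks at every epoch (which is legitimate by memorylessness), and the function names and seed are hypothetical.

```python
import random

def dual_rate(d, x_hat, x):
    """Rate (5.2) specialised to the Ehrenfest chain, for x <= x_hat < d."""
    return (d - x_hat) * (x_hat + 1) / (x_hat + 1 - x)

def run_continuous(d, rng):
    """Return the epochs at which the pure-birth dual jumps, ending with absorption at d."""
    time, x, x_hat = 0.0, 0, 0
    birth_times = []
    while x_hat < d:
        chain_wait = rng.expovariate(d)    # X leaves state x at total rate x + (d - x) = d
        dual_wait = rng.expovariate(dual_rate(d, x_hat, x))
        if chain_wait < dual_wait:
            # Case (i): X moves first; the dual follows only if X jumps to x_hat + 1.
            time += chain_wait
            x = x - 1 if rng.random() < x / d else x + 1
            if x == x_hat + 1:
                x_hat += 1
                birth_times.append(time)
        else:
            # Case (ii): the auxiliary clock rings first and the dual gives birth.
            time += dual_wait
            x_hat += 1
            birth_times.append(time)
    return birth_times

birth_times = run_continuous(4, random.Random(1))
```

The differences between consecutive entries of `birth_times` are the exponential waiting times whose sum is the absorption time of the dual.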

6 Occupation times and connection with Ray-Knight Theorem

Our final section utilizes work of Kent [13]; see the historical note at the end of
Section 1 of [5] for closely related material. We show how to extend the continuous-time
Theorem 1.1 from the hitting time of state d first to the occupation-time vector for the
states $\{0, \dots, d-1\}$ and then to the local time of Brownian motion, thereby proving
the Ray-Knight theorem [17, 14].
6.1 From hitting time to occupation times

Consider a continuous-time irreducible birth-and-death chain with generator $G^*$. It is
then immediate from the Karlin-McGregor theorem (Theorem 1.1) that the hitting time
$T^*$ of state d has Laplace transform

$$ E\, e^{-u T^*} = \frac{\det(-G_0)}{\det(-G_0 + uI)}, \tag{6.1} $$

with $G_0$ obtained from $G^*$ by leaving off the last row and column.

Equation (6.1) gives the distribution of the total time elapsed before the chain hits
state d. But how is that time apportioned to the states $0, \dots, d-1$? This question
can be answered from (6.1) using a neat trick of Kent [13] [see the last sentence of his
Remark (1) on page 164]. To find the multivariate distribution of the occupation-time
vector $T = (T_0, T_1, \dots, T_{d-1})$, where $T_i$ denotes the occupation time of (i.e., amount
of time spent in) state i, it of course suffices to compute the value $E\, e^{-\langle u, T \rangle}$ of the
Laplace transform for any vector $u = (u_0, \dots, u_{d-1})$ with strictly positive entries. But
the distribution of the random variable $\langle u, T \rangle = \sum_i u_i T_i$ is that of the time to absorption
for the time-changed generator $G_u$ (say) obtained by dividing the ith row of $G^*$ by $u_i$
for $i = 0, \dots, d-1$. Therefore, by (6.1) and the scaling property of determinants,

$$ E\, e^{-\langle u, T \rangle} = \frac{\det(-(G_u)_0)}{\det(-(G_u)_0 + I)} = \frac{\det(-G_0)}{\det(-G_0 + U)}, $$

where $(G_u)_0$ is $G_u$ with its last row and column left off and $U := \mathrm{diag}(u_0, \dots, u_{d-1})$.
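The determinant scaling step can be verified exactly on a small example. In the sketch below, the 2-by-2 matrix `G0` (a hypothetical truncated generator for d = 2) and the weights `u` are illustrative choices, and `det` is a small fraction-exact Gaussian elimination written for the example; dividing row i of `G0` by `u[i]` plays the role of the time change.

```python
from fractions import Fraction

def det(m):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [row[:] for row in m]
    n, result = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [a[r][j] - factor * a[col][j] for j in range(n)]
    return result

# Hypothetical truncated generator G_0 (d = 2) and positive weights u.
G0 = [[Fraction(-1), Fraction(1)],
      [Fraction(1, 2), Fraction(-2)]]
u = [Fraction(2), Fraction(3)]
n = len(G0)

neg_G0 = [[-G0[i][j] for j in range(n)] for i in range(n)]
neg_Gu0 = [[-G0[i][j] / u[i] for j in range(n)] for i in range(n)]    # time-changed rows
plus_I = [[neg_Gu0[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
plus_U = [[neg_G0[i][j] + (u[i] if i == j else 0) for j in range(n)] for i in range(n)]

lhs = det(neg_Gu0) / det(plus_I)     # transform of the time-changed absorption time at 1
rhs = det(neg_G0) / det(plus_U)      # the closed form in terms of G_0 and U
```

Both ratios evaluate to the same rational number, reflecting that the factor $\det(U^{-1})$ cancels between numerator and denominator.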



6.2 From occupation times to the Ray-Knight Theorem

Call the stationary distribution $\pi$. Then the matrix $S := D(-G_0)D^{-1}$ is (strictly)
positive definite, where $D := \mathrm{diag}(\sqrt{\pi})$. Let $\Sigma := \frac{1}{2} S^{-1}$. By direct calculation, T has
the same law as $Y + Z$, where Y and Z are independent random vectors with the same
law and Y is the coordinate-wise square of a Gaussian random vector $V \sim N(0, \Sigma)$.

Kent [13] uses and extends this "double derivation" of $\mathcal{L}(T)$ to prove the theorem
of Ray [17] and Knight [14] expressing the local time of Brownian motion as the sum
of two independent 2-dimensional Bessel processes (i.e., as the sum of two independent
squared Brownian motions).

Acknowledgments. We thank Persi Diaconis for helpful discussions, and Raymond
Nung-Sing Sze and Chi-Kwong Li for pointing out the reference [16].